ABSTRACT

A method of improving the compression of image data using Lempel-Ziv-based
coding is presented. Image data is first processed with a simple transform, such as
the Walsh-Hadamard Transform, to produce subbands. The subbanded data can be
rounded to eight bits, or it can be quantized for higher compression at the cost of
some reduction in the quality of the reconstructed image. The data is then run-length
coded to take advantage of the large runs of zeros produced by quantization.
Compression results are presented and contrasted with a subband compression method
using quantization followed by run-length coding and Huffman coding. The
Lempel-Ziv-based coding in conjunction with run-length coding produces the best
compression results at the same reconstruction quality (compared with the
Huffman-based coding) on the image data used.

QUANTIZATION-BASED LOSSY COMPRESSION 

A typical compression coding scheme for subbanded data uses run-length and
Huffman coders on quantized data [1, 2, 3]. This is also the approach used in the
JPEG method for coding the high-frequency DCT coefficients. Statistical coders
such as these should do well with data whose histograms have large peaks at
zero, like those of the higher bands in subbanded data. The improvement in
compressibility from this method comes from the quantization. Quantization maps
(replaces) a range of values (a "bin") onto one quantization value, reducing the
variability of the data by restricting the number of possible values to a small number.
The rounding of values to eight bits is actually quantization with small bin sizes.

By coarsely quantizing the data, some noise is removed along with some
information, which improves the compression. With coarser quantization, the compression
improves, but at a cost of added distortion to the reconstructed image. The key area 
for coarse quantization of subbands is the region around zero. Because of the peak 
of the histogram of a subband at zero, a deadband around zero will quantize more 
values to zero providing longer run-lengths at a cost of somewhat more distortion. 
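
The effect of the deadband on the number of zeros can be illustrated with a short sketch. The function names, the sample values, and the bin width of 5 are illustrative assumptions and are not taken from the coder used in this work; the sketch only shows that widening the deadband sends more samples to zero.

```python
def deadband_quantize(x: int, dead: int, step: int) -> int:
    """Midtread quantizer with a deadband (illustrative, not the paper's code).

    Magnitudes up to `dead` map to zero; larger magnitudes fall into
    uniform bins of width `step` and are replaced by the bin centre.
    """
    mag = abs(x)
    if mag <= dead:
        return 0
    k = (mag - dead - 1) // step              # index of the bin beyond the deadband
    centre = dead + 1 + k * step + (step - 1) // 2
    return centre if x > 0 else -centre


def count_zeros(values, dead, step=5):
    """Number of samples that land in the deadband."""
    return sum(1 for v in values if deadband_quantize(v, dead, step) == 0)


samples = [0, 1, -2, 3, -4, 6, 9, -1, 2, 0]   # made-up subband samples
narrow = count_zeros(samples, dead=2)          # deadband of -2 to +2
wide = count_zeros(samples, dead=7)            # deadband of -7 to +7
```

With these made-up samples the wider deadband quantizes nine of the ten values to zero instead of six, at the price of a larger error on the values it absorbs.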

Quantization is the key difference between lossy and lossless coding. After
quantization, compression is obtained by using lossless coders, such as run-length and
Huffman coders. The loss all comes from the quantization stage. 

This paper will present results from a subband compression approach to see if 
good lossy compression ratios can be obtained with LZ-based coding. The LZ-based 
coder is a public domain software program used on personal computers for general 
purpose text file compression and archiving (LHa by "Yoshi"). 

QUANTIZER SELECTION


Variations possible in quantizers include adaptive vs. fixed, midrise vs. midtread, 
symmetric vs. non-symmetric, uniform bin size vs. non-uniform, centered quantization 
values vs. centroid of pdf, bin size, deadband size, and threshold value. The type of 
quantizer that should be used can be deduced by looking at the histograms of
subbands. These histograms have a peak around zero for all but the lowest band.
Because of the basic similarities of the histograms of various images’ subbands, 
adaptive quantizers will not be considered here. 

To prepare the data for a run-length coder, we desire a lot of zero values.
Because of the large number of subband values around zero, the type of quantizer that
will provide a lot of zero values is a midtread quantizer (having a quantization bin 
with zero at the center). Because of the symmetry of the histograms, a symmetric 
quantizer around zero is also appropriate. The small probability of large values in 
the subband would suggest a non-uniform quantizer that provides larger bin sizes at 
higher values. 

The quantization bin around zero is called a deadband. If a uniform quantizer
were used with a large bin size (e.g., 32), then a deadband smaller than the uniform
bin size may be necessary to minimize the difference between the reconstructed pixel
value and the original pixel value. The size of a bin or deadband will affect the
amount of distortion in the reconstructed image. The maximum error for a value in
a particular bin is half the bin size for a quantizer with a centered quantization value.
For a non-centered quantization value, the maximum possible error for any particular
quantized value is larger, although the total error for all values may be lower. This
raises the question: is it better to have fewer large errors or lots of smaller errors?
Up to a certain bin size it is obviously better to have lots of smaller errors because
those errors will not be noticeable. For example, a lot of errors of one count per
pixel in an image will not be noticeable at all. Also, a large error in a high-frequency
region of the image should not be as serious as one in a low-frequency area
because of Human Visual System (HVS) masking, unless the high frequency is a lone
edge where artifacts can be very noticeable.

A threshold is not appropriate for subbands because of the large errors that can 
be introduced. Even though a large value in a subband is very rare, the effect of 
clipping it off with a threshold can be noticeable. Large values occur at light/dark 
boundaries or edges, and the HVS is sensitive to noise near edges. Many images do
not have any values in the subbands greater than a certain threshold, so the temptation
is there to put one in since it will not degrade the test images at all. Bins at
large values can be maintained at low cost because, if they are not used, their
quantization values can be effectively removed with an entropy coder after the run-length
coding.

There are four quantizer designs that will be used in this research: 1) a fine 
quantizer for the DPCM coding of the low band, 2) a fine quantizer for the subbands
of high-quality reconstructions for scientific applications, 3) a coarse quantizer for the 
mid-bands of an entertainment-quality reconstruction, and 4) a very coarse quantizer 
for the highest band of the entertainment-quality reconstruction. 

QUANTIZER DESIGN 

Now that a midtread, non-uniform, symmetric quantizer has been selected, it 
remains to define the bins and the quantization values of each bin. To simplify the 
design somewhat, we can divide the design into three sections: 1) the deadband, 2) 
the low (near-zero) bins, and 3) the high (away from zero) bins. The quantizer will 
be applied to subbanded image data that has not been scaled or rounded to eight-bit
values, for example, 10-bit values for a four-band Walsh-Hadamard transform. If the
subband values were rounded to eight bits before quantizing, additional distortions
would be introduced. This is because rounding to eight bits is a uniform quantization,
and a two-stage quantization will introduce additional distortion unless the bin
boundaries for the second stage exactly match a subset of the bin boundaries for the 
first stage. 
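
The two-stage effect can be checked numerically with a small sketch. The second stage uses the positive half of the COARSE 2 quantizer of Table I; the first-stage step of 4, the tie-breaking rule, and the function names are illustrative assumptions rather than details of the processing used here.

```python
# Positive half of the "COARSE 2" quantizer from Table I: (low, high, value).
COARSE2 = [(0, 7, 0), (8, 63, 20), (64, 190, 127), (191, 255, 254)]


def quantize(x: int) -> int:
    """Symmetric table lookup; magnitudes above 255 reuse the last bin (assumed)."""
    mag = abs(x)
    for low, high, value in COARSE2:
        if mag <= high:
            break
    return value if x >= 0 else -value


def round_to_step(x: int, step: int = 4) -> int:
    """First-stage uniform quantizer: nearest multiple of `step`, ties away from zero."""
    mag = (abs(x) + step // 2) // step * step
    return mag if x >= 0 else -mag


# Inputs whose final quantized value changes when the first stage is applied.
changed = [x for x in range(256) if quantize(round_to_step(x)) != quantize(x)]
```

Under these assumptions the inputs 6, 7, 62, 63, and 190 are pushed across a second-stage bin boundary by the first-stage rounding. For example, 7 is quantized to 0 directly but to 20 after rounding to 8, so its error grows from 7 to 13.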

The deadband design is simply a matter of selecting the bin size, since the
quantization value will be zero. A large bin size will result in longer runs of
zeros and in increased distortion in the reconstructed image. A smaller bin size will
result in fewer zero quantization values and in better reconstruction. The design
trade-off is to make the bin as large as possible without introducing noticeable distortion
due to quantization.

The low bins seem to fall naturally within ±32, judging from the histograms of
subbands. A bin size comparable to the deadband size may be appropriate. The
quantization value for the low bins should be somewhat closer to zero than the
center of the bin because of the curve of the histogram in the bin, at least for the
bins nearest the deadband. The optimum place would be the centroid of the
histogram in the bin, but that value will change from image to image. Since the
histogram curve flattens out as it gets away from zero, it may not be worth the trouble to
move the quantization value from the center for bins farther out.

Looking at the values beyond ±32, large bins with centered quantization values
are probably sufficient because there are not many values in any particular
quantization bin, so the contribution to quantization noise by having a value at the center
of the bin rather than at the centroid will be small.

For the DPCM quantizer, the number of bins is 31 with a deadband from -2 to
+2 (see Table I). The fine quantizer has 63 bins with a deadband of -3 to +3. The
two coarse quantizers have a deadband of -7 to +7, one with 7 and one with 15 bins.
The quantizers generally have smaller bins near zero compared to bins away from
zero since most subband values are expected to be near zero. The fine quantizer has
a maximum bin size of nine with a quantization value at the center of the bin. Thus,
no quantized value changes from its original value by more than four counts. The
non-uniform quantizers used here are really made up of a couple of uniform
quantizers with larger bin sizes used for the more extreme values. The subbanded data was
processed such that the range of raw values was -255 to 255. This was accomplished
by combining the transform scaling factor for analysis and synthesis into one scaling
factor for analysis of 1/4.
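
The fine quantizer can be written as a table lookup, which also allows the four-count error bound to be checked. The sketch below rebuilds the positive half of the FINE column of Table I; the helper names and the clamping of magnitudes above 255 are assumptions made for illustration.

```python
def build_fine_bins():
    """Positive half of the FINE quantizer in Table I as (low, high, value)."""
    bins = [(0, 3, 0), (4, 7, 5)]
    bins += [(lo, lo + 4, lo + 2) for lo in range(8, 28, 5)]    # 8-12 ... 23-27
    bins.append((28, 31, 30))
    # Nine-wide bins with centred values; the last bin is cut off at 255.
    bins += [(lo, min(lo + 8, 255), lo + 4) for lo in range(32, 256, 9)]
    return bins


FINE_BINS = build_fine_bins()


def quantize_fine(x: int) -> int:
    """Symmetric lookup of the quantization value for an integer subband sample."""
    mag = min(abs(x), 255)          # assumption: clamp out-of-range magnitudes
    for low, high, value in FINE_BINS:
        if low <= mag <= high:
            return value if x >= 0 else -value
    raise AssertionError("bins should cover 0..255")


# Largest deviation between a sample and its quantized value over the full range.
max_error = max(abs(x - quantize_fine(x)) for x in range(-255, 256))
```

The table holds 32 entries for the non-negative side, which corresponds to 63 bins once the negative side is mirrored, and `max_error` evaluates to four counts.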

RUN LENGTH CODER DESIGN 

The run length coder for quantized subband values can be designed to take 
advantage of the structure of the data that we expect from the quantizer. The data 
should consist of many runs of zeros with some very long runs where there is little 
spatial high frequency information. The number of different non-zero values will be 
the same as the number of bins (less the deadband) in the quantizer, which should 
be considerably less than the number of possible values in the unquantized data. 
There will be runs of non-zero values also, but these will not be as long as the zero 
value runs. 

To take advantage of this structure, the run length coder has been designed to
encode the subbands into one- or two-byte codewords representing runs of zeros
or of up to sixteen different quantized values. This run length coder maintains
byte-sized codewords, which simplifies handling of the data somewhat. The first bit of the
codeword determines whether it represents a run of zeros or a run of non-zero
values. Runs of zeros are coded with one or two bytes, while runs of non-zeros are
coded with one byte only. The second bit in a codeword that represents a run of
zeros indicates whether the codeword is one or two bytes long. The
remaining bits are the length of the run of zeros (up to 64 for a one-byte codeword,
and up to 16448 for a two-byte codeword).

If the codeword represents a run of non-zero values, then four bits of the codeword
represent the bin identification and the remaining three bits represent the
length of the run (up to eight). The non-zero codewords can handle up to sixteen
quantization bins with a run length of one to eight. The codewords use the following
format:

one-byte zero codeword

b7  b6  b5  b4  b3  b2  b1  b0
0   0   R   R   R   R   R   R

two-byte zero codeword

b15 b14 b13 b12 b11 b10 b9  b8
0   1   R   R   R   R   R   R

b7  b6  b5  b4  b3  b2  b1  b0
R   R   R   R   R   R   R   R

non-zero, 16-bin codeword

b7  b6  b5  b4  b3  b2  b1  b0
1   B   B   B   B   R   R   R

where: B indicates bin identifying bits
       R indicates run length bits
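
A minimal sketch of a coder for the 16-bin format is given below. The stored offsets (run length minus 1 for one-byte codewords, run length minus 65 for two-byte zero codewords) are inferred from the stated maxima of 64 and 16448, and the numbering of bins from 1 to 16 is hypothetical; neither detail is confirmed by the original implementation.

```python
from itertools import groupby


def rle_encode(symbols):
    """Encode bin indices (0 = deadband, 1..16 = non-zero bins) into codewords."""
    out = bytearray()
    for sym, group in groupby(symbols):
        run = len(list(group))
        if sym == 0:
            while run > 0:
                n = min(run, 16448)
                if n <= 64:
                    out.append(n - 1)                 # 00RRRRRR, run of 1..64
                else:
                    v = n - 65                        # 14-bit field, run of 65..16448
                    out.append(0x40 | (v >> 8))       # 01RRRRRR (high bits)
                    out.append(v & 0xFF)              # RRRRRRRR (low bits)
                run -= n
        else:
            if not 1 <= sym <= 16:
                raise ValueError("bin index must be in 1..16")
            while run > 0:
                n = min(run, 8)
                out.append(0x80 | ((sym - 1) << 3) | (n - 1))   # 1BBBBRRR
                run -= n
    return bytes(out)


def rle_decode(data):
    """Inverse of rle_encode."""
    out, i = [], 0
    while i < len(data):
        b = data[i]
        if b & 0x80:                                  # run of a non-zero bin
            out += [((b >> 3) & 0x0F) + 1] * ((b & 0x07) + 1)
            i += 1
        elif b & 0x40:                                # two-byte run of zeros
            out += [0] * ((((b & 0x3F) << 8) | data[i + 1]) + 65)
            i += 2
        else:                                         # one-byte run of zeros
            out += [0] * ((b & 0x3F) + 1)
            i += 1
    return out
```

For example, 100 zeros, three samples in bin 3, five zeros, and one sample in bin 16 encode to the five bytes 40 23 92 04 F8 (hexadecimal) under these assumptions.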

Because the high band quantization has 63 bins, the run length coder was modified
for use with high band data to work with 64 bins. The change to increase the
number of bins reduced the length of runs that can be coded to a maximum of two.
The non-zero codewords for the 64-bin version follow the format below:

non-zero, 64-bin codeword

b7  b6  b5  b4  b3  b2  b1  b0
1   B   B   B   B   B   B   R

LOWEST BAND CODING


The classic approach of Gharavi and Tabatabai [1] uses a two-dimensional 
DPCM coder for the low band and a quantizer/ run-length coder for the upper bands. 
The DPCM coder uses a third-order predictor using three previously decoded pixels, 
x = 0.5A + 0.25B + 0.25C, where x is the prediction, A is the previous horizontal 
pixel, B is the previous vertical pixel, and C is the previous diagonal pixel following 
B. In [1], the differential signal is quantized with 31 levels, symmetric, non-uniform 
quantization followed by a variable length coder. 
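
The prediction loop can be sketched as follows. The handling of border pixels, the start value of 128, and the reading of C as the pixel after B on the previous line are assumptions made for this illustration; [1] should be consulted for the exact arrangement. The quantizer is passed in as a function so that any of the designs discussed earlier could be used.

```python
def predict(recon, r, c):
    """Third-order prediction 0.5*A + 0.25*B + 0.25*C from decoded pixels."""
    cols = len(recon[0])
    a = recon[r][c - 1] if c > 0 else None                     # previous horizontal pixel
    b = recon[r - 1][c] if r > 0 else None                     # previous vertical pixel
    d = recon[r - 1][c + 1] if r > 0 and c + 1 < cols else b   # diagonal pixel after B
    if a is None and b is None:
        return 128.0                                           # assumed start value
    if a is None:
        a = b                                                  # first column: reuse B
    if b is None:
        b = d = a                                              # first row: reuse A
    return 0.5 * a + 0.25 * b + 0.25 * d


def dpcm_encode(image, quantize):
    """Return quantized differences and the encoder's own reconstruction."""
    rows, cols = len(image), len(image[0])
    recon = [[0.0] * cols for _ in range(rows)]
    codes = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            p = predict(recon, r, c)
            codes[r][c] = quantize(round(image[r][c] - p))
            recon[r][c] = p + codes[r][c]      # predict from decoded values only
    return codes, recon


def dpcm_decode(codes):
    """Rebuild the image from the quantized differences."""
    rows, cols = len(codes), len(codes[0])
    recon = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            recon[r][c] = predict(recon, r, c) + codes[r][c]
    return recon
```

Because the encoder predicts from its own reconstruction rather than from the original pixels, the decoder reproduces that reconstruction exactly and quantization errors do not accumulate along a line.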

The DPCM predictor from [1] will be used in this work, but with a different
quantizer and entropy coder. The quantizer has a deadzone of ±2 (following [2]),
and bin sizes of 5 (low bins) and 23 (bins above 13) with no upper threshold. After
quantization, an adaptive Huffman coder or LZ coder is used to provide compression.
Table II gives the results for the four test images. The LZ-based coder results
are better than the adaptive Huffman coder’s for three of the four images. The
image where the adaptive Huffman coder does better is the Baboon image, where the result
is about 10% better than for LZ. Run-length coding could be used before the statistical
coders, but the added complexity was not justified by the small improvement in
compression.

The low band coding determines the overall compression achieved because 
it is by far the hardest band to compress. The low band has nearly all of the signal 
energy of the original, and so is the biggest challenge to code. A high quality low
band is required for good reconstruction. 

The basic reconstruction quality possible with a given low band coding scheme 
can be estimated by using the low band alone to make a reconstruction. For a four 
band split, this can be done by doubling the horizontal and vertical lines of data to 
obtain a reconstructed image the same size as the original (basically by "zooming in"). 
The zoomed low band was used to give the base reconstructed PSNR values given 
in Table II. 
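
The base measurement can be sketched as pixel doubling followed by a PSNR calculation. The PSNR definition with a peak value of 255 is the customary one for eight-bit images and is assumed here, since the formula is not stated in the text; the function names are illustrative.

```python
import math


def zoom2x(band):
    """Double every pixel horizontally and every line vertically."""
    out = []
    for row in band:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def psnr(original, reconstructed, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized images."""
    total, count = 0.0, 0
    for row_o, row_r in zip(original, reconstructed):
        for o, r in zip(row_o, row_r):
            total += (o - r) ** 2
            count += 1
    if total == 0:
        return math.inf                       # identical images
    return 10.0 * math.log10(peak * peak / (total / count))
```

A base PSNR would then be obtained as `psnr(original, zoom2x(low_band))`, where `low_band` is the decoded low band on the same gray scale as the original.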

If better compression ratios were desired in the following research, then 
improving the low band coding would be the place to start. A very good fidelity low 
band coder was used in this research because the interest here is in the coding of the 
higher subbands. The same low band coder was used in both the fine and coarse 
cases so that its effect on the results would be negligible. Better compression ratios 
can be achieved by trading more distortion in the reconstructed image. A larger 
deadband and coarser quantization of the DPCM data would be a place to start. 
Absolute compression ratios were not the goal of this research, rather a comparison 
of compression approaches was undertaken. 

COMPRESSION RESULTS


The coding scheme described above was applied to subbanded image data 
from four test images. The Walsh-Hadamard transform was used to generate four 
bands for each image. 
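
One common way to obtain four bands with the Walsh-Hadamard transform is to apply the 2 x 2 transform to each non-overlapping block of four pixels. The sketch below follows that reading with the single analysis scaling factor of 1/4; the block arrangement and the band names are assumptions, and the exact normalization used in this work may differ.

```python
def wht_analysis(image):
    """Split an image with even dimensions into four half-size bands."""
    rows, cols = len(image) // 2, len(image[0]) // 2
    ll, lh, hl, hh = ([[0.0] * cols for _ in range(rows)] for _ in range(4))
    for r in range(rows):
        for c in range(cols):
            a, b = image[2 * r][2 * c], image[2 * r][2 * c + 1]
            d, e = image[2 * r + 1][2 * c], image[2 * r + 1][2 * c + 1]
            ll[r][c] = (a + b + d + e) / 4      # low band (block average)
            lh[r][c] = (a - b + d - e) / 4      # horizontal detail
            hl[r][c] = (a + b - d - e) / 4      # vertical detail
            hh[r][c] = (a - b - d + e) / 4      # diagonal detail
    return ll, lh, hl, hh


def wht_synthesis(ll, lh, hl, hh):
    """Inverse transform; no scaling is needed because analysis used 1/4."""
    rows, cols = len(ll), len(ll[0])
    image = [[0.0] * (2 * cols) for _ in range(2 * rows)]
    for r in range(rows):
        for c in range(cols):
            s, h, v, g = ll[r][c], lh[r][c], hl[r][c], hh[r][c]
            image[2 * r][2 * c] = s + h + v + g
            image[2 * r][2 * c + 1] = s - h + v - g
            image[2 * r + 1][2 * c] = s + h - v - g
            image[2 * r + 1][2 * c + 1] = s - h - v + g
    return image
```

Without quantization the synthesis step returns the original block values, so any loss in the complete scheme comes from the quantizers rather than from the transform.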

The resulting compression using the lossy technique is very good for
entertainment-quality images such as would be used for HDTV. Entertainment quality is the
result of using the coarse quantizers. Table III contrasts the results for both fine and 
coarse quantizers resulting in high quality and entertainment quality reconstructions 
respectively. The compression ratio shown is for run-length followed by LZ-based 
coding. 

The coarse quantization provided about a 50% improvement over the fine 
quantization in this case. The Baboon image proved hardest to compress because 
of its noise-like high frequency information. The noise-like nature of the image 
makes a lower quality reconstruction more tolerable, however. An easy improvement 
in compression without noticeable effect on quality can be obtained by dropping the
high band completely, which results in a compression ratio of 3.4:1 for the fine
quantizer and 5.3:1 for the coarse quantizer. The Baboon image is a nice one to use
for testing compression because of the challenge of compressing the high frequency
content, but not so good for finding distortion, which is masked by the high
frequencies.

LZ and adaptive Huffman coding are compared in Table IV. Adaptive Huffman
coding was used to avoid the overhead incurred in transmitting the Huffman
tree for every image. The comparison is between the higher bands of the test images
in a four band split. In both cases the same quantizers and run-length coders are
used; the difference is in the final coding stage. The LZ-based coder beats the
Huffman coder in 19 out of 24 cases, sometimes by a factor of over 100. In the five
cases where the Huffman coder outperformed LZ, the improvement was only around
10%. This occurred in images with lots of high frequency content (i.e., Baboon),
which does not fit well with the model used by LZ coding. The surprising result is
that the LZ-based coder works very well as a statistical coder for image data and that
quantized, subbanded image data is generally well compressed using LZ. LZ-based
coding also generally provided some improvement in compression for data that had
already been Huffman coded.
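
The final coding stage can be tried with any general-purpose LZ-based compressor. The sketch below uses the DEFLATE coder in Python's standard `zlib` module purely as a stand-in for LHa, which is not the coder used in this work, so the sizes it produces will not match the tables. The sample bytes are made up.

```python
import zlib


def lz_compress(rle_bytes: bytes) -> bytes:
    """Final coding stage: general-purpose LZ-based compression (zlib stand-in)."""
    return zlib.compress(rle_bytes, 9)


def compression_ratio(original_size: int, coded_size: int) -> float:
    """Ratio of original size to coded size, both in bytes."""
    return original_size / coded_size


# Made-up run-length coder output with a repeating pattern of codewords.
sample = bytes([0x7F, 0xFF, 0x92, 0x04]) * 200
packed = lz_compress(sample)
ratio = compression_ratio(len(sample), len(packed))
```

Repeated sequences of codewords, such as those produced by large uniform image areas, are the kind of structure an LZ coder can exploit and a symbol-by-symbol Huffman coder cannot.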

CONCLUSION


The use of a Lempel-Ziv-based coder as a statistical coder for subbanded 
image data is very promising. Simple subbanding schemes can be used to prepare 
image data for compression by a text coder. This allows the use of commonly available
archiving programs for compression of documents that include text and image
data. 

TABLE I


QUANTIZERS


DPCM              FINE              COARSE 1          COARSE 2
31 bins           63 bins           (MID BANDS)       (HIGH BANDS)
                                    15 bins           7 bins

BIN RANGE VALUE   BIN RANGE VALUE   BIN RANGE VALUE   BIN RANGE VALUE

-2 to +2      0   -3 to +3      0   -7 to +7      0   -7 to +7      0
3-7           5   4-7           5   8-31         20   8-63         20
8-12         10   8-12         10   32-61        41   64-190      127
13-25        17   13-17        15   62-102       82   191-255     254
26-42        34   18-22        20   103-143     123
43-59        51   23-27        25   144-184     164
60-76        68   28-31        30   185-225     205
77-93        85   32-40        36   226-255     246
94-110      102   41-49        45
111-127     119   50-58        54
128-144     136   59-67        63
145-161     153   68-76        72
162-178     170   77-85        81
179-195     187   86-94        90
196-220     212   95-103       99
221-255     255   104-112     108
                  113-121     117
                  122-130     126
                  131-139     135
                  140-148     144
                  149-157     153
                  158-166     162
                  167-175     171
                  176-184     180
                  185-193     189
                  194-202     198
                  203-211     207
                  212-220     216
                  221-229     225
                  230-238     234
                  239-247     243
                  248-255     252

Note: Quantizers are symmetric around zero. Only positive values are shown.

TABLE II


LOW BAND DPCM COMPRESSION RESULTS


                                  FILE SIZE (bytes)
                         Quantized    Adaptive                 PSNR     BASE PSNR
IMAGE                    Original     Huffman     LZ-Based     (dB)     (dB)

LENNA Low Band             65,540      27,723       15,290     43.60    31.20
BABOON Low Band            65,540      27,327       30,489     38.85    23.23
IO Low Band                51,204      11,850       10,527     44.09    35.09
JUPITER Low Band          100,804      44,510       22,377     38.96    31.91
Original LENNA 512 x 512  262,148      73,254       83,379     43.51    43.51


Notes:


1. PSNR is calculated relative to the original low band data.

2. Base PSNR is calculated relative to the full size original image using
only the low band quantized data for the reconstruction.

TABLE III


LOSSY COMPRESSION RESULTS

RUN LENGTH AND LZ-BASED CODING OF QUANTIZED, SUBBANDED
DATA WITH DPCM CODED LOW BAND


IMAGE                                 FINE QUANTIZATION   COARSE QUANTIZATION

LENNA     PSNR (dB)                   37.98               33.78
          Compression Ratio (C.R.)    6.9 : 1             11.1 : 1

BABOON    PSNR (dB)                   35.68               28.77
          C.R.                        2.7 : 1             4.3 : 1

IO        PSNR (dB)                   40.33               36.34
          C.R.                        10.1 : 1            15.0 : 1

JUPITER   PSNR (dB)                   36.27               32.73
          C.R.                        6.9 : 1             12.5 : 1

TABLE IV


LZ-BASED VS. ADAPTIVE HUFFMAN COMPRESSION
OF QUANTIZED SUBBANDS